Processor Pool-Based Scheduling for Large-Scale NUMA
Authors
Abstract
Large-scale Non-Uniform Memory Access (NUMA) multiprocessors are gaining increased attention due to their potential for achieving high performance through the replication of relatively simple components. Because of the complexity of such systems, scheduling algorithms for parallel applications are crucial in realizing the performance potential of these systems. In particular, scheduling methods must consider the scale of the system, with the increased likelihood of creating bottlenecks, along with the NUMA characteristics of the system, and the benefits to be gained by placing threads close to their code and data. We propose a class of scheduling algorithms based on processor pools. A processor pool is a software construct for organizing and managing a large number of processors by dividing them into groups called pools. The parallel threads of a job are run in a single processor pool, unless there are performance advantages for a job to span multiple pools. Several jobs may share one pool. Our simulation experiments show that processor pool-based scheduling may effectively reduce the average job response time. The performance improvements attained by using processor pools increase with the average parallelism of the jobs, the load level of the system, the differentials in memory access costs, and the likelihood of having system bottlenecks. As the system size increases, while maintaining the workload composition and intensity, we observed that processor pools can be used to provide significant performance improvements. We therefore conclude that processor pool-based scheduling may be an effective and efficient technique for scalable systems.
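The pool construct described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the names `Pool`, `make_pools`, and `place_job` are hypothetical, and the placement policy (pick the least-loaded pool, and span pools only when a job has more threads than one pool has processors) is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A group of processors managed as one unit (hypothetical structure)."""
    processors: list[int]
    threads: int = 0  # threads currently assigned; several jobs may share a pool


def make_pools(num_processors: int, pool_size: int) -> list[Pool]:
    """Divide processors 0..num_processors-1 into contiguous pools."""
    return [
        Pool(list(range(start, min(start + pool_size, num_processors))))
        for start in range(0, num_processors, pool_size)
    ]


def place_job(pools: list[Pool], parallelism: int) -> list[int]:
    """Assign a job's threads and return the indices of the pools used.

    Assumed policy, not taken from the paper: keep the job in the
    least-loaded pool when its threads fit that pool's processors,
    otherwise span pools in order of increasing load.
    """
    # sorted() is stable, so ties are broken by the lower pool index.
    order = sorted(range(len(pools)), key=lambda i: pools[i].threads)
    first = order[0]
    if parallelism <= len(pools[first].processors):
        pools[first].threads += parallelism
        return [first]

    remaining = parallelism
    used = []
    for i in order:
        if remaining == 0:
            break
        share = min(remaining, len(pools[i].processors))
        pools[i].threads += share
        remaining -= share
        used.append(i)
    # More threads than processors in the system: excess stays in the first pool.
    pools[first].threads += remaining
    return used


pools = make_pools(16, 4)
print(place_job(pools, 3))  # small job stays inside one pool
print(place_job(pools, 6))  # job larger than a pool spans several
```

Keeping a job inside one pool is what places its threads close to each other and to their data on a NUMA machine; the spanning branch stands in for the case where the abstract says a job may gain from using multiple pools.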
Similar resources
Multiprogrammed Parallel Application Scheduling in NUMA Multiprocessors
The invention, acceptance, and proliferation of multiprocessors are primarily a result of the quest to increase computer system performance. The most promising features of multiprocessors are their potential to solve problems faster than previously possible and to solve larger problems than previously possible. Large-scale multiprocessors offer the additional advantage of being able to execute ...
An Experimental Evaluation of Processor Pool-Based Scheduling for Shared-Memory NUMA Multiprocessors
In this paper we describe the design, implementation and experimental evaluation of a technique for operating system schedulers called processor pool-based scheduling [51]. Our technique is designed to assign processes (or kernel threads) of parallel applications to processors in multiprogrammed, shared-memory NUMA multiprocessors. The results of the experiments conducted in this research demon...
Classifying and alleviating the communication overheads in matrix computations on large-scale NUMA multiprocessors
Large-scale, shared-memory multiprocessors have non-uniform memory access (NUMA) costs. High communication cost dominates the execution of matrix computations. Memory contention and remote memory access are two major communication overheads on large-scale NUMA multiprocessors. However, previous experiments and discussions focus either on reducing the number of remote memory accesses...
Task Parallel Models Based on Dynamic Data Placement to Reduce NUMA Effects
NUMA (Non-Uniform Memory Access) multicore computers have become popular in scientific and industrial fields due to their scalable memory performance. However, large-scale data-intensive computing on NUMA architectures faces data locality problems, called NUMA effects, that are caused by the overhead of cross-node data accesses. Our task parallel model is based on the strategy o...
Addressing Processor Over-provisioning on Large-scale Multi-core Platforms
Modern micro-architectures have embraced multi-core processors and thread-level parallelism for performance growth, because of the difficulty of increasing single-core performance without significantly increasing processor power consumption. To meet the ever-growing need for speed, current large-scale computing platforms are Non-Uniform Memory Access (NUMA) architectures equipped with dozens o...